We report a measurement of the top quark mass $M_t$ in the dilepton decay channel
$t\bar{t} \to b\ell'^{+}\nu_{\ell'}\bar{b}\ell^{-}\bar{\nu}_{\ell}$. Events are selected with a neural
network which has been directly optimized for statistical precision in top quark mass using
neuroevolution, a technique modeled on biological evolution. The top quark mass is extracted from
per-event probability densities that are formed by the convolution of leading order matrix elements
and detector resolution functions. The joint probability is the product of the probability
densities from 344 candidate events in 2.0 fb$^{-1}$ of $p\bar{p}$ collisions collected with the
CDF II detector, yielding a measurement of
$M_t = 171.2 \pm 2.7(\mathrm{stat.}) \pm 2.9(\mathrm{syst.})$ GeV/$c^2$.

PACS numbers: 14.65.Ha, 13.85.Ni, 13.85.Qk, 12.15.Ff 
Over ten years after the discovery of the top quark, its mass, $M_t$, remains a quantity of great
interest. $M_t$-dependent terms contribute to radiative corrections to precision electroweak
observables, thus providing information on the unobserved Higgs boson [1] and other particles in
possible extensions to the standard model (SM) [2]. Top quarks are produced only at the Fermilab
Tevatron, primarily in pairs, and decay ≈100% of the time to a $W$ boson and a $b$ quark,
$t\bar{t} \to W^+ b W^- \bar{b}$, in the SM. The dilepton channel, where both $W$ bosons decay to
charged leptons (electrons and muons, including leptonic decays of $\tau$ leptons) and neutrinos,
has the smallest branching fraction, but also has the fewest hadronic jets in the final state and
hence a smaller sensitivity to their energy calibration. Significant differences in the
measurements of $M_t$ in different decay channels could indicate contributions from sources beyond
the SM [3].

Reconstruction of $M_t$ in the dilepton channel presents unique challenges, as the two neutrinos
in the final state result in a kinematically underconstrained system. We utilize a likelihood-based
estimator that convolves leading order SM matrix elements with detector resolution functions and
integrates over unmeasured quantities. Prior applications of this method to dilepton events have
yielded the most precise measurements of $M_t$ in this channel [4, 5, 6]. These prior measurements
utilize event selection criteria that were designed to maximize signal purity for a measurement of
the $t\bar{t}$ production cross section [7]. Optimizing the selection for precision in $M_t$ is
hampered by the difficulty of searching the space of arbitrary multivariate selections.
Well-established multivariate algorithms such as neural networks are typically limited to
minimization of a specific metric, such as misclassification error. They are not designed to
optimize an event ensemble property, such as the uncertainty on the top quark mass. In contrast,
the technique of neuroevolution [8] combines the parametrization of an arbitrary multivariate
selection described by a neural network with an evolutionary minimization approach to search for
the network weights and topology which optimize an arbitrary metric. In this Letter, we present a
measurement using an improved matrix element analysis technique and an event selection optimized
with neuroevolution to minimize the expected statistical uncertainty in the top quark mass
measurement. We utilize 2.0 fb$^{-1}$ of data collected between March 2002 and May 2007 with the
CDF II detector at the Fermilab Tevatron.

CDF II [9, 10, 11] contains a charged particle tracking system consisting of a silicon microstrip
tracker and a drift chamber immersed in a 1.4 T magnetic field. Surrounding electromagnetic and
hadronic calorimeters measure particle energies. Outside the calorimeters, drift chambers and
scintillators detect muons.

We use lepton triggers that require an electron or muon with $p_T > 18$ GeV/$c$. We define a
preselection which satisfies the basic signature of top dilepton decay: two oppositely charged
leptons with $p_T > 20$ GeV/$c$, two or more jets with $E_T > 15$ GeV [12] within the region
$|\eta| < 2.5$, missing transverse energy $\not\!\!E_T > 20$ GeV [13], and dilepton invariant mass
$M_{\ell\ell} > 10$ GeV/$c^2$. Suppression of the $Z \to \ell\ell$ background is performed by the
subsequent neural-network selection.
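
The preselection thresholds above can be written as a short filter. The following is only a sketch: the `Lepton`, `Jet`, and `Event` containers and their field names are hypothetical, and requiring exactly two leptons above threshold is an assumption, since the text says only "two oppositely charged leptons".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Lepton:
    pt: float      # transverse momentum in GeV/c
    charge: int    # +1 or -1


@dataclass
class Jet:
    et: float      # transverse energy in GeV
    eta: float     # pseudorapidity


@dataclass
class Event:
    leptons: List[Lepton]
    jets: List[Jet]
    met: float     # missing transverse energy in GeV
    m_ll: float    # dilepton invariant mass in GeV/c^2


def passes_preselection(ev: Event) -> bool:
    """Apply the preselection thresholds quoted in the text."""
    good_leptons = [l for l in ev.leptons if l.pt > 20.0]
    # Assumption: exactly two leptons above threshold, with opposite charge.
    if len(good_leptons) != 2:
        return False
    if good_leptons[0].charge * good_leptons[1].charge >= 0:
        return False
    good_jets = [j for j in ev.jets if j.et > 15.0 and abs(j.eta) < 2.5]
    if len(good_jets) < 2:
        return False
    return ev.met > 20.0 and ev.m_ll > 10.0
```

Trigger requirements and lepton identification are not modeled in this sketch.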

Neuroevolution, an approach modeled on biological evolution, is used to search directly for the
optimal neural network. Beginning with a population of 150 networks with random weights, the
statistical precision of $M_t$ is evaluated for each network by performing experiments using the
simulated signal and background events which survive a threshold requirement on the network
output. The events are simulated using the PYTHIA [14] and ALPGEN [15] generators and a full
detector simulation [16]. Poorly performing networks are culled and the 30 strongest performers
are bred together and mutated in successive generations until performance reaches a plateau in a
statistically independent pool of events, which occurs after 15 generations. The statistical
uncertainty obtained from the best performer in each generation is shown in Fig. 1(a). In the
context of an arbitrary but a priori fixed choice of network threshold, the networks evolve to
optimize the selection regardless of the threshold's value. Because we have optimized directly on
the final statistical precision rather than some intermediate or approximate figure of merit, the
best-performing network is the one which gives the most precise measurement. This approach has
been shown to significantly outperform traditional methods in event selection [17]. In particular,
we use neuroevolution of augmenting topologies (NEAT) [18], a neuroevolutionary method capable of
evolving a network's topology and weights.
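
The cull, breed, and mutate cycle described above can be illustrated with a deliberately simplified sketch. It is not NEAT and not the analysis code: the network is a single fixed sigmoid unit (NEAT also evolves topology), the events are toy Gaussian features, and the figure of merit `sqrt(S + B) / S` is an assumed stand-in for the expected statistical uncertainty that the analysis obtains from simulated experiments. Only the population size (150), the number of surviving parents (30), the number of generations (15), and the output threshold (0.5) are taken from the text; all function names are hypothetical.

```python
import math
import random


def make_events(rng, n, mean, label):
    """Toy events: two Gaussian features; label 1 = signal, 0 = background."""
    return [((rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)), label) for _ in range(n)]


def network_output(weights, features):
    """Fixed-topology stand-in for a network: a single sigmoid unit."""
    z = weights[0] + weights[1] * features[0] + weights[2] * features[1]
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def expected_uncertainty(weights, events, threshold=0.5):
    """Assumed proxy metric: sqrt(S + B) / S for events above the threshold."""
    s = b = 0
    for features, label in events:
        if network_output(weights, features) > threshold:
            if label == 1:
                s += 1
            else:
                b += 1
    return math.sqrt(s + b) / s if s > 0 else float("inf")


def evolve(events, rng, pop_size=150, n_parents=30, n_generations=15):
    """Rank networks, keep the best as parents, refill by crossover and mutation."""
    population = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(n_generations):
        ranked = sorted(population, key=lambda w: expected_uncertainty(w, events))
        best_per_generation.append(expected_uncertainty(ranked[0], events))
        parents = ranked[:n_parents]  # poorly performing networks are culled
        children = []
        while len(children) < pop_size - n_parents:
            mother, father = rng.sample(parents, 2)
            # Crossover picks each weight from one parent; mutation adds noise.
            child = [rng.choice(pair) + rng.gauss(0.0, 0.1) for pair in zip(mother, father)]
            children.append(child)
        population = parents + children
    return ranked[0], best_per_generation
```

Because the parents are carried over unchanged, the best metric value in this sketch cannot get worse from one generation to the next. The sketch evaluates every generation on the same events; the plateau check in a statistically independent pool mentioned above is not implemented.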

Some of the events passing this selection have secondary vertex tags [19], which enhance the
$b$-quark fraction and thus the signal purity. We exploit this enhancement by separately fitting
events with and without secondary vertex tags, and combining the fits. The predicted number of
signal and background events is shown in Table I. Using the optimized selection improves the
a priori statistical uncertainty on $M_t$ over the selection used in previous analyses [6] by 20%.
This neural network selection yields 344 candidate events (Fig. 2). Strikingly, the sample selected
by the neural network is expected to be dominated by background events; the resulting measurement
is expected to be more precise than previous measurements due to the increase in $t\bar{t}$
acceptance and the suppression of background effects as described below. The distribution of
expected statistical uncertainty versus signal purity for all evaluated networks can be seen in
Fig. 1(b).

TABLE I: Expected sample composition after neural network selection for events with and without
secondary vertex tags.

Source                                                    N(0-tag)        N(≥1-tag)
$Z \to \ell\ell$                                          116.5 ± 18.6    4.1 ± 1.8
$Z \to \ell\ell$ + $c\bar{c}/b\bar{b}$                    9.3 ± 1.4       10.1 ± 4.0
$WW$, $WZ$, $ZZ$, $W\gamma$                               17.3 ± 5.9      0.7 ± 0.7
Misidentified leptons                                     29.0 ± 8.7      4.5 ± 1.1
$t\bar{t}$ ($\sigma = 6.7$ pb, $M_t = 175$ GeV/$c^2$)     43.8 ± 4.4      78.0 ± 6.2
Total                                                     215.8 ± 21.9    97.5 ± 7.2
Observed (2.0 fb$^{-1}$)                                  246             98

FIG. 1: Top (a), expected statistical uncertainty for the best network in each successive
generation of network evaluation. The points show the average performance for each generation;
the error bars show the variation due to the randomly generated networks in generation 0.
Bottom (b), expected statistical uncertainty on $M_t$ versus signal fraction after neural network
selection, for all evaluated networks. The selection [7] used in previous measurements is shown
(•) for comparison. The arrows show the expected statistical uncertainty and signal fraction
corresponding to the network used in the analysis.

FIG. 2: The output of the final network evaluated on the collected data (black triangles), with
expected signal and background contributions (stacked solid histograms). The data show events
passing the preselection. The evolution of the optimum selection network is performed with an
a priori threshold set at 0.5 for candidate selection. Of the 642 preselected events shown, 344
events pass this threshold and constitute the final candidate sample for mass-fitting.
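
As a cross-check of the statement that the selected sample is expected to be background dominated, the expected yields of Table I can be summed directly. This is a small arithmetic sketch; the dictionary keys and function names are illustrative only.

```python
# Expected yields from Table I as (0-tag, >=1-tag) pairs; uncertainties omitted.
expected = {
    "Z->ll": (116.5, 4.1),
    "Z->ll + cc/bb": (9.3, 10.1),
    "WW, WZ, ZZ, Wgamma": (17.3, 0.7),
    "Misidentified leptons": (29.0, 4.5),
    "ttbar": (43.8, 78.0),
}
observed = (246, 98)


def totals(yields):
    """Sum the expected yields in each tag category."""
    return tuple(sum(v[k] for v in yields.values()) for k in (0, 1))


def signal_fractions(yields, signal="ttbar"):
    """Return the per-category and overall expected signal fractions."""
    tot = totals(yields)
    per_tag = tuple(yields[signal][k] / tot[k] for k in (0, 1))
    overall = sum(yields[signal]) / sum(tot)
    return per_tag, overall


per_tag, overall = signal_fractions(expected)
print(totals(expected), per_tag, overall, sum(observed))
```

The row sums come to about 215.9 and 97.4, which match the quoted totals of 215.8 and 97.5 to within 0.1, presumably a rounding effect. The expected signal fraction is roughly 20% in the untagged sample, 80% in the tagged sample, and 39% overall.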

We express the probability density for the observed lepton and jet measurements, $\mathbf{x}_i$,
as a function of the top quark mass $M_t$ as $P_s(\mathbf{x}_i|M_t)$. We calculate
$P_s(\mathbf{x}_i|M_t)$ using the theoretical description of the $t\bar{t}$ production and decay
process with respect to $\mathbf{x}_i$,
$P_s(\mathbf{x}_i|M_t) = \frac{1}{\sigma(M_t)} \frac{d\sigma(M_t)}{d\mathbf{x}_i}$, where
$d\sigma/d\mathbf{x}_i$ is the differential cross section [20, 21, 22] and $\sigma$ is the total
cross section. The term $1/\sigma(M_t)$ ensures that the probability density satisfies the
normalization condition, $\int d\mathbf{x}_i \, P_s(\mathbf{x}_i|M_t) = 1$.

We evaluate $P_s(\mathbf{x}_i|M_t)$ [5] by integrating over quantities that are not directly
measured, such as neutrino momenta and quark energies. The effect of simplifying assumptions is
estimated using simulated experiments. We integrate over quark energies using a parameterized
detector transfer function [5] $W(p, j)$, defined as the probability of measuring jet energy $j$
given quark energy $p$.
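
The role of the transfer function can be illustrated in one dimension. The sketch below is a toy, not the parameterization of Ref. [5]: the Gaussian form of $W(p, j)$, its 15% relative width, and the Gaussian parton-level density are all assumptions chosen for illustration. It integrates numerically over the unmeasured quark energy $p$ and lets one verify that the resulting density in the measured jet energy $j$ remains normalized.

```python
import math


def transfer_function(p, j, rel_width=0.15):
    """Toy W(p, j): Gaussian density for measured jet energy j given quark energy p."""
    sigma = rel_width * p
    return math.exp(-0.5 * ((j - p) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def parton_density(p, mean=60.0, width=10.0):
    """Toy stand-in for the normalized parton-level density (1/sigma) dsigma/dp."""
    return math.exp(-0.5 * ((p - mean) / width) ** 2) / (width * math.sqrt(2.0 * math.pi))


def jet_density(j, p_min=10.0, p_max=110.0, n_steps=400):
    """P(j) = integral over p of parton_density(p) * W(p, j), by the midpoint rule."""
    dp = (p_max - p_min) / n_steps
    total = 0.0
    for k in range(n_steps):
        p = p_min + (k + 0.5) * dp
        total += parton_density(p) * transfer_function(p, j)
    return total * dp


# Normalization check: integrate P(j) over the measured jet energy.
dj = 0.5
norm = sum(jet_density((k + 0.5) * dj) for k in range(400)) * dj
print(norm)
```

Since both toy densities integrate to one, the printed normalization should be close to 1, up to truncation and discretization of the integrals.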

We account for backgrounds using their probability densities $P_{bg_j}(\mathbf{x}_i)$ and form the
full per-event probability

$$P^{n}(\mathbf{x}_i|M_t) = P_s(\mathbf{x}_i|M_t)\, p_s^{n} + \sum_j P_{bg_j}(\mathbf{x}_i)\, p_{bg_j}^{n}. \qquad (1)$$

The functions $P_{bg_j}(\mathbf{x}_i)$ are calculated using the differential cross section for each
background. The proportions $p_s^{n}$ and $p_{bg_j}^{n}$ depend on whether the event has $n$
secondary vertex tags, and are obtained from Table I. We evaluate background probability densities
for: $Z/\gamma^{*}(\to ee, \mu\mu)$+jets, $W + \geq 3$ jets where a jet is misidentified as a
lepton, and $WW$+jets. Probability densities for smaller backgrounds ($WZ$, $ZZ$, $W\gamma$, and
$Z \to \tau\tau$) provide negligible gain in sensitivity and are not modeled.

The posterior joint probability for the sample is the product of the per-event probability
densities,

$$P(\mathbf{x}|M_t) = \Big[\prod_{i_0} P^{0}(\mathbf{x}_{i_0}|M_t)\Big] \times \Big[\prod_{i_1} P^{\geq 1}(\mathbf{x}_{i_1}|M_t)\Big] \qquad (2)$$

over all untagged ($i_0$) and tagged ($i_1$) events. The measured mass $M_t$ is taken as the mean
$\langle M_t \rangle$ computed using the posterior probability, and the measured statistical
uncertainty $\Delta M_t$ is taken as the standard deviation.
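
The step from per-event densities to a measured mass and uncertainty can be sketched on a grid of mass hypotheses. Everything in the following toy is an assumption made for illustration: each event is reduced to one observable with Gaussian signal and background densities, a single background component replaces the sum over backgrounds, and the signal fractions of 0.20 (untagged) and 0.80 (tagged) are rounded from the expected yields in Table I. The prior on the mass is taken to be flat.

```python
import math
import random

# Approximate signal fractions p_s^n implied by Table I (0: untagged, 1: tagged).
SIGNAL_FRACTION = {0: 0.20, 1: 0.80}


def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def signal_density(x, m_top):
    """Toy P_s(x | M_t): the observable tracks the mass hypothesis."""
    return gauss_pdf(x, m_top, 20.0)


def background_density(x):
    """Toy background density, independent of M_t."""
    return gauss_pdf(x, 150.0, 40.0)


def event_density(x, n_tag, m_top):
    """Per-event mixture of signal and background, weighted by tag category."""
    p_s = SIGNAL_FRACTION[n_tag]
    return p_s * signal_density(x, m_top) + (1.0 - p_s) * background_density(x)


def posterior_mean_std(events, masses):
    """Product over events (in log space), then mean and std over the mass grid."""
    log_l = [sum(math.log(event_density(x, n, m)) for x, n in events) for m in masses]
    peak = max(log_l)
    weights = [math.exp(v - peak) for v in log_l]  # flat prior assumed
    norm = sum(weights)
    mean = sum(w * m for w, m in zip(weights, masses)) / norm
    var = sum(w * (m - mean) ** 2 for w, m in zip(weights, masses)) / norm
    return mean, math.sqrt(var)


def toy_events(rng, n_events, m_true, tagged_fraction=0.285):
    """Draw toy events from the same mixture model used in the fit."""
    events = []
    for _ in range(n_events):
        n_tag = 1 if rng.random() < tagged_fraction else 0
        if rng.random() < SIGNAL_FRACTION[n_tag]:
            x = rng.gauss(m_true, 20.0)
        else:
            x = rng.gauss(150.0, 40.0)
        events.append((x, n_tag))
    return events
```

For example, `posterior_mean_std(toy_events(random.Random(7), 344, 172.0), [140.0 + 0.5 * k for k in range(141)])` returns the posterior mean and standard deviation for one toy experiment with 344 events. Working with log-densities avoids numerical underflow in the product over events.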



The response of our method for simulated experiments (Fig. 3(a)) is consistent with a linear
dependence on the true top mass. Its slope is less than unity due to the presence of unmodeled
background. We derive corrections, $M_t \to 175.0$ GeV/$c^2$ $+ (M_t - 171.0$ GeV/$c^2)/0.86$ and
$\Delta M_t \to \Delta M_t/0.86$, from this response and apply them to the measured quantities in
data.
FIG. 3: (a) Mean measured $M_t$ in simulated experiments versus top quark masses. The solid line
is a linear fit to the points. (b) Pull widths from simulated experiments versus top quark masses.
The solid line is the average over all points.



From the pull distribution of our simulated experiments, we find that $\Delta M_t$ is
underestimated (Fig. 3(b)). This is due to simplifying assumptions made in the probability
calculations for computational tractability [5]. These assumptions are violated in small,
well-understood ways in realistic events. We scale $\Delta M_t$ by an additional factor,
$S = 1.16$, derived from our simulated experiments. Applying this method to the 344 candidate
events, we measure $M_t = 171.2 \pm 2.7(\mathrm{stat.})$ GeV/$c^2$. The posterior probability is
Gaussian within the statistical accuracy of the Monte Carlo integration.
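
The response and pull corrections quoted in the text amount to a linear rescaling, sketched below. The constants are those given above; the function names are illustrative, and the pull is assumed to be defined as (measured mass minus true mass) divided by the estimated uncertainty, which the text does not spell out.

```python
import statistics

SLOPE = 0.86        # slope of the response quoted in the text
RAW_ANCHOR = 171.0  # raw mass (GeV/c^2) that the correction maps to 175.0 GeV/c^2
TRUE_ANCHOR = 175.0
PULL_SCALE = 1.16   # additional factor S applied to the uncertainty


def correct_mass(raw_mass):
    """Invert the linear response: M_t -> 175.0 + (M_t - 171.0) / 0.86."""
    return TRUE_ANCHOR + (raw_mass - RAW_ANCHOR) / SLOPE


def correct_uncertainty(raw_uncertainty):
    """Apply the response correction and then the pull-width scale factor S."""
    return PULL_SCALE * raw_uncertainty / SLOPE


def pull_width(measured, uncertainties, true_mass):
    """Standard deviation of the pulls over a set of simulated experiments."""
    pulls = [(m - true_mass) / u for m, u in zip(measured, uncertainties)]
    return statistics.pstdev(pulls)
```

A pull width above one means the estimated uncertainties are too small on average, which is the situation the factor $S$ corrects for.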

There are several sources of systematic uncertainty in our measurement, which are summarized in
Table II. The single largest source of systematic error comes from the uncertainty in the jet
energy scale, which we estimate to be 2.5 GeV/$c^2$ by varying the scale within its uncertainty
[23]. An uncertainty specific to jets resulting from $b$ partons contributes 0.4 GeV/$c^2$, while
in-time pileup contributes 0.2 GeV/$c^2$. Uncertainty due to the Monte Carlo generator used for
$t\bar{t}$ events is estimated as the difference in $M_t$ extracted from PYTHIA events and
HERWIG [24] events and amounts to 0.9 GeV/$c^2$. Uncertainties due to PDFs are estimated using
different PDF sets (CTEQ5L [25] vs. MRST72 [26]), different values of $\Lambda_{QCD}$, and varying
the eigenvectors of the CTEQ6M [25] set; the quadrature sum of the latter two (dominant)
uncertainties is 0.6 GeV/$c^2$. The limited number of background events available for simulated
experiments results in an uncertainty on the shape of the background distributions, which yields
an uncertainty on $M_t$ of 0.5 GeV/$c^2$. Uncertainty due to imperfect modeling of initial and
final state QCD radiation (ISR and FSR, respectively) is estimated by varying the amounts of ISR
and FSR in simulated events [27] and is estimated to be 0.5 GeV/$c^2$. The uncertainty in the mass
due to uncertainties in the response correction is evaluated by varying the response within the
uncertainties shown in Fig. 3 and is 0.4 GeV/$c^2$. The contribution from uncertainties in
background composition is estimated by varying the background normalizations from Table I within
their uncertainties and amounts to 0.3 GeV/$c^2$. We estimate the uncertainty coming from modeling
of the missing transverse energy in $Z/\gamma^{*}$ events and the uncertainty in the data-derived
model of misidentified leptons to be 0.2 GeV/$c^2$. The uncertainty in the lepton energy scale
contributes an uncertainty of 0.1 GeV/$c^2$ to our measurement. Adding in quadrature yields a
total systematic uncertainty of 2.9 GeV/$c^2$.

TABLE II: Summary of systematic uncertainties on the measured top quark mass.

Source                            Size (GeV/$c^2$)
Generic jet energy scale          2.5
$b$-jet energy scale              0.4
In-time pileup                    0.2
Generator                         0.9
PDFs                              0.6
Background statistics             0.5
Radiation                         0.5
Response correction               0.4
Sample composition uncertainty    0.3
Background modeling               0.2
Lepton energy scale               0.1
Total                             2.9
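
The quoted total can be checked by adding the individual entries of Table II in quadrature, which treats the sources as uncorrelated. The dictionary below simply transcribes the table.

```python
import math

# Systematic uncertainties from Table II, in GeV/c^2.
systematics = {
    "Generic jet energy scale": 2.5,
    "b-jet energy scale": 0.4,
    "In-time pileup": 0.2,
    "Generator": 0.9,
    "PDFs": 0.6,
    "Background statistics": 0.5,
    "Radiation": 0.5,
    "Response correction": 0.4,
    "Sample composition uncertainty": 0.3,
    "Background modeling": 0.2,
    "Lepton energy scale": 0.1,
}

# Quadrature sum: square root of the sum of squares.
total_systematic = math.sqrt(sum(v ** 2 for v in systematics.values()))
print(round(total_systematic, 1))
```

The sum of squares is 8.42, so the total is about 2.90 GeV/$c^2$, in agreement with the quoted 2.9 GeV/$c^2$. The jet energy scale alone accounts for roughly three quarters of the total variance.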

In summary, we have presented a new measurement of the top quark mass in the dilepton channel. We
have applied the technique of neuroevolution, for the first time in particle physics, to devise an
event selection criterion which optimizes statistical precision. We measure
$M_t = 171.2 \pm 2.7(\mathrm{stat.}) \pm 2.9(\mathrm{syst.})$ GeV/$c^2$. This is the single most
precise measurement of $M_t$ in this channel to date, is in good agreement with measurements in
other
channels [28, 29], and represents a ~30% improvement in statistical precision over the previously
published measurements in this channel [5, 30, 31].